Background / Context:

Description of prior research and its intellectual context. 

Researchers planning a randomized field trial to evaluate the effectiveness of an educational
intervention often face the following dilemma. They plan to recruit schools to participate in their
study. Should they randomly assign individuals (either students or teachers, depending on the
intervention) within schools to treatment conditions, or should all participating students at a
given school be assigned to the same treatment condition? That is, should they randomize
schools (clusters), or individuals within schools?

One reason often given for preferring cluster level randomization is a fear of “diffusion of 
treatment” (Raudenbush, 1997) or “contamination” (Donner and Klar, 2000). Contamination 
occurs when contact between members of the control group and members of the experimental 
group causes control group participants to behave more like experimental group participants than 
they would have had that contact not occurred. For certain interventions, contamination could
also cause experimental subjects to behave more like control group subjects than they would
otherwise. Both processes, experimental subjects acting more like control subjects and control
subjects acting more like experimental subjects, tend to decrease the effect size of the
experiment. I assume for the purposes of this paper that contamination dilutes the observed
effect size in this fashion. The methods considered would not apply to a contamination process
that tends to spuriously increase the effect size, such as control group demoralization
(Shadish, Cook & Campbell, 2002).

Cornfield (1978) noted that two penalties are paid for randomization by cluster rather than by 
individual. First, the variance of the estimated treatment effect increases. Second, the degrees of 
freedom available to estimate that variance decrease. Thus, in the absence of contamination, 
randomizing an equal number of individuals within each cluster to each treatment (often called a 
randomized block or RB design) is a more powerful design than randomizing whole clusters 
(often called a cluster randomized or CR design). Rhoads (forthcoming) has argued that the 
threat of contamination should not necessarily lead experimenters to opt for a cluster randomized 
design. He points out that, depending on the values of relevant design parameters (i.e., the ICC, 
within cluster sample size, the heterogeneity in treatment effects across clusters, and the number 
of clusters in the experiment) the statistical power of a randomized block design remains higher 
than the power of a cluster randomized design even when contamination causes the effect size to 
decrease by as much as 10-60%. Similarly, from the standpoint of mean squared error, Rhoads 
(forthcoming) shows that for many design parameters of practical interest the randomized block 
design will be preferred to the cluster randomized design. 

However, it may well be the case that the optimal experimental design in the presence of 
contamination is neither the RB design nor the CR design, but some compromise between the 
two designs. This is precisely the situation considered by Borm, Melis, Teerenstra and Peer 
(2005) (hereafter BMTP) who suggest an interesting compromise between cluster randomization 
and individual randomization, a method they call “pseudo cluster randomization.” They label 
the two treatments that are under investigation as treatments A and B. Suppose that 2m clusters,
each of size n, are available for the experiment. Pseudo cluster randomization is a two step
randomization procedure. First, the clusters are randomly assigned to two groups labeled a and
b, resulting in m clusters per group. Then, for each cluster in group a, a fraction f (0.5 < f < 1)
of the individuals in that cluster are randomly assigned to treatment A and the rest to treatment
B. In cluster group b, the same fraction f of individuals in each cluster are assigned to treatment
B and the rest to treatment A. Using the same fraction f in each cluster group ensures that the
design is balanced on both the individual and the cluster level.

2011 SREE Conference Abstract Template
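The two step procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pseudo_cluster_randomize`, its arguments, and the returned dictionary layout are hypothetical and not part of BMTP, and the sketch assumes that n times f is a whole number.

```python
import random


def pseudo_cluster_randomize(cluster_ids, n, f, seed=None):
    """Sketch of two-step pseudo cluster randomization.

    cluster_ids: labels of the 2m clusters; n: individuals per cluster;
    f: majority fraction with 0.5 < f < 1 (n * f is assumed to be an integer).
    Returns {cluster_id: (group, [treatment label for each individual])}.
    """
    if len(cluster_ids) % 2 != 0:
        raise ValueError("an even number (2m) of clusters is required")
    if not 0.5 < f < 1:
        raise ValueError("f must satisfy 0.5 < f < 1")
    rng = random.Random(seed)
    n_major = round(n * f)

    # Step 1: randomly split the clusters into groups a and b, m clusters each.
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    m = len(shuffled) // 2
    group_of = {cid: ("a" if idx < m else "b") for idx, cid in enumerate(shuffled)}

    # Step 2: within each cluster, a fraction f receives the majority treatment
    # (A in group a, B in group b) and the rest receive the other treatment.
    assignment = {}
    for cid in cluster_ids:
        major, minor = ("A", "B") if group_of[cid] == "a" else ("B", "A")
        labels = [major] * n_major + [minor] * (n - n_major)
        rng.shuffle(labels)
        assignment[cid] = (group_of[cid], labels)
    return assignment


design = pseudo_cluster_randomize(list(range(10)), n=20, f=0.8, seed=1)
```

With 10 clusters of size 20 and f = 0.8, each group a cluster has 16 individuals on A and each group b cluster has 4, so half of all individuals receive each treatment. This is the balance property described above.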

Next, define $\bar{y}_{A,a}$ and $\bar{y}_{A,b}$ to be the mean outcome of individuals assigned to treatment A in
cluster groups a and b, respectively. Notation for the mean outcomes of those receiving
treatment B is defined analogously. It may be advantageous to weight responses in the
different cluster groups differently. Thus, BMTP propose the following estimator of the mean
outcome for those receiving A:

$$\bar{y}_A = \frac{f\,\bar{y}_{A,a} + w(1-f)\,\bar{y}_{A,b}}{f + w(1-f)}. \tag{1}$$

Then the variance of the estimated treatment difference is

$$\operatorname{var}(\bar{y}_A - \bar{y}_B) = \frac{2\sigma^2}{mn}\cdot\frac{f + w^2(1-f) + nq\{f - w(1-f)\}^2}{\{f + w(1-f)\}^2}, \tag{2}$$

where q is defined by $q = \rho/(1-\rho)$, and $\rho$ is the intracluster correlation coefficient.
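A minimal Python sketch of the variance in equation (2) may help in comparing designs. It assumes the formula is var = (2σ²/mn) · [f + w²(1 − f) + nq{f − w(1 − f)}²] / {f + w(1 − f)}² and that σ² is the within-cluster (residual) variance. The function name `bmtp_variance` and the parameter values are illustrative only.

```python
def bmtp_variance(f, w, n, m, rho, sigma2=1.0):
    """Variance of the estimated treatment difference under pseudo cluster
    randomization, assuming the form of equation (2) described in the lead-in.

    sigma2 is taken to be the within-cluster variance (an assumption).
    """
    q = rho / (1.0 - rho)
    denom = f + w * (1.0 - f)
    numer = f + w ** 2 * (1.0 - f) + n * q * (f - w * (1.0 - f)) ** 2
    return 2.0 * sigma2 * numer / (m * n * denom ** 2)


# Randomized block design: f = 0.5 with equal weights (w = 1).
var_rb = bmtp_variance(f=0.5, w=1.0, n=20, m=10, rho=0.2)

# Limiting case f = 1: every individual in a cluster gets the same treatment,
# which corresponds to cluster randomization (w then has no effect).
var_cr = bmtp_variance(f=1.0, w=1.0, n=20, m=10, rho=0.2)

# Unequal allocation with weight w = f / (1 - f): the term in rho vanishes.
var_wi = bmtp_variance(f=0.8, w=0.8 / 0.2, n=20, m=10, rho=0.2)
```

Under these illustrative values the cluster randomized variance is 1 + nq = 6 times the randomized block variance, in line with the variance penalty that Cornfield (1978) describes.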



Contamination is conceptualized as follows. Let $\mu_A$ and $\mu_B$ be population mean outcomes
under treatments A and B in the absence of contamination and let $\delta = \mu_A - \mu_B$. Then the
expected outcome for those receiving treatment A in group a is $\mu_A - c_{f,A}\delta$ ($0 < c_{f,A} < 1$).
Here $c_{f,A}$ is the contamination of A by B given that a fraction f receive A in each cluster.
Defining other contamination factors in an analogous fashion we find that the expected value of
$\bar{y}_A - \bar{y}_B$ with contamination is given by

$$E(\bar{y}_A - \bar{y}_B) = \delta\,\frac{f(1 - c_{f,A} - c_{f,B}) + w(1-f)(1 - c_{1-f,A} - c_{1-f,B})}{f + w(1-f)}. \tag{3}$$

BMTP note that the ratio $\{E(\bar{y}_A - \bar{y}_B)\}^2 / \operatorname{var}(\bar{y}_A - \bar{y}_B)$ is inversely proportional to the number of
subjects required to achieve fixed type I and type II error rates using z critical values. Hence,
from the standpoint of statistical power one would prefer experimental designs with larger
values of this ratio.
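The following Python sketch shows how the expected difference under contamination and the squared-expectation-to-variance ratio might be computed together. It rests on two assumptions. First, that the expected difference is δ · [f(1 − c_{f,A} − c_{f,B}) + w(1 − f)(1 − c_{1−f,A} − c_{1−f,B})] / {f + w(1 − f)}. Second, that the variance has the form given in equation (2), with σ² the within-cluster variance. All function and argument names are hypothetical.

```python
def expected_difference(delta, f, w, c_f_a, c_f_b, c_1mf_a, c_1mf_b):
    """Expected A-minus-B difference under contamination (assumed form).

    c_f_a: contamination of A where a fraction f receive A; c_1mf_a: where a
    fraction 1 - f receive A; c_f_b and c_1mf_b are the analogues for B.
    """
    denom = f + w * (1.0 - f)
    kept_major = 1.0 - c_f_a - c_f_b        # share of delta kept by the majority arms
    kept_minor = 1.0 - c_1mf_a - c_1mf_b    # share of delta kept by the minority arms
    return delta * (f * kept_major + w * (1.0 - f) * kept_minor) / denom


def difference_variance(f, w, n, m, rho, sigma2=1.0):
    """Variance of the estimated difference (assumed form of equation (2))."""
    q = rho / (1.0 - rho)
    denom = f + w * (1.0 - f)
    numer = f + w ** 2 * (1.0 - f) + n * q * (f - w * (1.0 - f)) ** 2
    return 2.0 * sigma2 * numer / (m * n * denom ** 2)


def power_ratio(delta, f, w, n, m, rho, contamination, sigma2=1.0):
    """Squared expected difference over variance; larger values need fewer subjects."""
    expected = expected_difference(delta, f, w, *contamination)
    return expected ** 2 / difference_variance(f, w, n, m, rho, sigma2)


# Illustrative values: 1:1 allocation, A uncontaminated, B moved halfway to A.
contaminated = (0.0, 0.5, 0.0, 0.5)   # (c_f_a, c_f_b, c_1mf_a, c_1mf_b)
e_diff = expected_difference(0.5, 0.5, 1.0, *contaminated)
ratio = power_ratio(0.5, 0.5, 1.0, n=20, m=10, rho=0.2, contamination=contaminated)
```

In this illustration the expected difference is half of δ, so the ratio is one quarter of its uncontaminated value.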

Purpose / Objective / Research Question / Focus of Study: 

Description of the focus of the research. 

This research focuses on extending results found in BMTP (2005) to the following cases: (1) the
case where the value of the ICC cannot be estimated precisely, (2) the case where treatment
effects vary across clusters, and (3) the case where the criterion function for determining an
optimal design/estimator is mean squared error, rather than statistical power.

Significance / Novelty of study: 

Description of what is missing in previous work and the contribution the study makes. 

There are three major limitations to the work done by BMTP that the current work addresses.
First, BMTP assume that a large number of clusters are enrolled in the experiment and that the
value of $\rho$ is known. The current work examines what can be done when the value of $\rho$
cannot be estimated precisely. Second, the BMTP work assumes homogeneous
treatment effects across clusters. This assumption appears to be standard in the health sciences
literature; however, it is not recommended in educational experiments, where it is possible, if not
likely, that the effect of treatment varies across educational settings. Third, BMTP
assume that the researcher will always prefer the design with more statistical power. However,
much recent work has recognized that researchers make a serious mistake when they summarize
their work solely in terms of the results of significance tests (Schmidt and Hunter,
1997; Ziliak and McCloskey, 2004). Hence, this paper compares the RB and CR designs using two
criteria: statistical power, and the mean squared error of the point estimate of the average
treatment effect. The results section will show that these two criteria do not always lead to the
same decision.

Statistical, Measurement, or Econometric Model: 

Description of the proposed new methods or novel applications of existing methods. 

The statistical model assumed throughout the paper is as follows:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \quad k = 1, \ldots, n_{ij}; \quad n_{1j} + n_{2j} = n, \tag{4}$$

where j indexes clusters, and $\mu$ and $\alpha_i$ are fixed parameters representing, respectively, the
overall mean and the deviation of the mean in treatment group i from the overall mean, $\mu$.
$\beta_j$ is a mean zero, normally distributed random effect associated with cluster j having
variance $\sigma_\beta^2$, $(\alpha\beta)_{ij}$ is a mean zero, normally distributed random effect representing the
interaction of the treatment effect with the cluster effect and having variance $\sigma_{\alpha\beta}^2$, and
$\varepsilon_{ijk}$ is a mean zero, normally distributed random effect associated with individual k within
cluster j and has variance $\sigma_\varepsilon^2$. Many results will depend on two relevant summary
quantities. The first is the intracluster correlation coefficient (ICC), defined as
$\rho = \sigma_\beta^2 / (\sigma_\beta^2 + \sigma_\varepsilon^2)$. The second is the parameter $\omega = \sigma_{\alpha\beta}^2 / \sigma_\beta^2$, which represents the ratio of
one half of the variation in treatment effects across clusters to the total variation across clusters.
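To make the roles of the variance components concrete, the sketch below simulates outcomes for one cluster from a model of this form and computes the two summary quantities. It is a hedged illustration: the names `simulate_cluster`, `icc`, and `omega` are hypothetical, the interaction effects for the two treatments within a cluster are assumed to be independent, and the parameter values are arbitrary.

```python
import random


def simulate_cluster(rng, mu, alpha, sd_beta, sd_ab, sd_eps, n1, n2):
    """Draw outcomes for one cluster: n1 individuals on treatment 1, n2 on 2.

    alpha maps treatment index (1 or 2) to its fixed deviation from mu.
    Returns a list of (treatment, outcome) pairs.
    """
    beta_j = rng.gauss(0.0, sd_beta)                       # cluster effect
    ab_j = {i: rng.gauss(0.0, sd_ab) for i in (1, 2)}      # treatment-by-cluster effects
    rows = []
    for i, n_i in ((1, n1), (2, n2)):
        for _ in range(n_i):
            eps = rng.gauss(0.0, sd_eps)                   # individual-level error
            rows.append((i, mu + alpha[i] + beta_j + ab_j[i] + eps))
    return rows


def icc(var_beta, var_eps):
    """Intracluster correlation: cluster variance over cluster plus individual variance."""
    return var_beta / (var_beta + var_eps)


def omega(var_ab, var_beta):
    """Ratio of the treatment-by-cluster interaction variance to the cluster variance."""
    return var_ab / var_beta


rng = random.Random(2011)
cluster = simulate_cluster(rng, mu=0.0, alpha={1: 0.25, 2: -0.25},
                           sd_beta=1.0, sd_ab=0.5, sd_eps=2.0, n1=10, n2=10)
rho_example = icc(var_beta=1.0, var_eps=4.0)       # 0.2
omega_example = omega(var_ab=0.25, var_beta=1.0)   # 0.25
```

Setting `sd_ab` to zero gives the homogeneous treatment effects case, in which ω = 0.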

Usefulness / Applicability of Method: 

Demonstration of the usefulness of the proposed methods using hypothetical or real data. 

I believe that the usefulness of the methods is evident from the results presented in the next 
section. 

Findings / Results: 

Description of the main findings with specific details. 
There are three distinct types of results, corresponding to the three ways in which this work
generalizes the work of BMTP. First, I present results comparing the RB design and the CR
design when mean squared error (MSE) is used as the criterion to choose between these designs.
These are contrasted with the results that would be obtained if statistical power were used as
the criterion. Results are reported in terms of MAC, or maximum allowable contamination,
which is defined as the amount of contamination that could be tolerated before a CR design
would be preferred to an RB design. When statistical power is used as the criterion, the formula
defining maximum allowable contamination is



$$\mathrm{MAC}_{\text{power}}(n, \rho, m) = 1 - \cdots\,\frac{1 + (n\omega/2 - 1)\rho}{1 + (n - 1)\rho} \tag{5}$$



When MSE is used as the criterion, the MAC formula is given by



$$\mathrm{MAC}_{\mathrm{MSE,het}}(\cdots) = \cdots \tag{6}$$



The quantity $\delta_T$ is defined as $\delta_T = \delta / \sqrt{\sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \sigma_\varepsilon^2}$. Tables 1 and 2 evaluate the formulae
given in equations (5) and (6) for various values of the relevant design parameters. Table 1
presents results for the case $\omega = 0$ (homogeneous treatment effects), and Table 2 presents results
for various positive values of $\omega$. The first three rows of each table present the corresponding
results when statistical power is used as a criterion. The tables demonstrate that these two
criteria can lead to very different results.



The next set of results concerns what can be obtained when pseudo cluster
randomization is used but the value of the ICC cannot be assumed to be known. I note from
equation (2) that if $w = f/(1-f)$ is used as the weighting function to form the estimator of the
average treatment effect, then the variance of the estimated treatment effect does not depend on
$\rho$. This is useful because frequently $\sigma^2$ can be estimated with much more precision than $\rho$. I
will refer to a treatment effect estimate constructed with weights $w = f/(1-f)$ as a
weighted-invariant (WI) estimator. The expected value of the estimated treatment effect for the WI
estimator will depend on the average contamination in each treatment at fractional randomization
f, $\bar{c}_f = (c_{f,A} + c_{1-f,A} + c_{f,B} + c_{1-f,B}) / 2$. While BMTP allow all values of c to vary from 0 to 1 in
an unrestricted fashion, this would imply that contamination could cause all control subjects to
behave as though they were uncontaminated experimental subjects and all experimental subjects
to behave as though they were uncontaminated control subjects, resulting in the absolute value of
$\delta$ staying the same but $\delta$ changing sign. This seems implausible, and so the restrictions
$c_{f,A} + c_{1-f,B} < 1$ seem logical. The degrees of freedom of the WI estimator are
the same as the degrees of freedom for an RB design, so a very good approximation to the
relative sample size required under the two designs is given by

$$\frac{\text{sample size required (WI)}}{\text{sample size required (RB)}} \approx \frac{(1 - \bar{c}_{0.5})^2}{4f(1-f)(1 - \bar{c}_f)^2}. \tag{7}$$
Since $f(1-f)$ is at most 0.25, in the absence of contamination we obtain the well known
result that a balanced experimental design is the most efficient. The WI design can improve on
the 1:1 design only if we can remove a substantial amount of contamination by using f > 0.5. For
instance, suppose that we believe that experimental subjects are unlikely to be contaminated by
control subjects and that at 1:1 randomization the control group mean will move halfway towards
the experimental group mean. That is, we think $c_{0.5,A} = 0$ and $c_{0.5,B} = 0.5$. Now suppose that at a
fractional allocation f = 0.8 experimental subjects are still uncontaminated by control subjects.
When 80% of the subjects in a cluster are in the experimental group, contamination of controls
increases to $c_{0.2,B} = 0.6$; however, when only 20% are in the experimental group, contamination
decreases to $c_{0.8,B} = 0.1$. The average contamination is then 0.5 at 1:1 allocation and 0.35 at
f = 0.8, so equation (7) results in 0.925 and the WI design would be
preferred. On the other hand, if $c_{0.8,B} = 0.2$ the 1:1 design is preferred.
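The arithmetic in this example can be checked with a short Python sketch. It assumes that the average contamination is half the sum of the four contamination factors and that equation (7) has the form (1 − c̄₀.₅)² / {4f(1 − f)(1 − c̄_f)²}. This reading reproduces the 0.925 quoted in the text. The function names are hypothetical.

```python
def average_contamination(c_f_a, c_1mf_a, c_f_b, c_1mf_b):
    """Average contamination per treatment: half the sum of the four factors."""
    return (c_f_a + c_1mf_a + c_f_b + c_1mf_b) / 2.0


def relative_sample_size(f, cbar_f, cbar_half):
    """Approximate sample size needed with the WI estimator at fraction f,
    relative to 1:1 allocation (assumed form of equation (7))."""
    return (1.0 - cbar_half) ** 2 / (4.0 * f * (1.0 - f) * (1.0 - cbar_f) ** 2)


# 1:1 allocation: experimental subjects uncontaminated, controls move halfway.
cbar_half = average_contamination(0.0, 0.0, 0.5, 0.5)        # 0.5

# f = 0.8, first scenario: control contamination of 0.6 and 0.1.
cbar_first = average_contamination(0.0, 0.0, 0.1, 0.6)       # 0.35
ratio_first = relative_sample_size(0.8, cbar_first, cbar_half)

# f = 0.8, second scenario: control contamination of 0.6 and 0.2.
cbar_second = average_contamination(0.0, 0.0, 0.2, 0.6)      # 0.4
ratio_second = relative_sample_size(0.8, cbar_second, cbar_half)
```

The first ratio is about 0.925, below one, so the WI design is preferred. The second is about 1.085, above one, so the 1:1 design is preferred.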

The WI estimator will be preferred only if the right hand side of equation (7) is less than one.
Hence, given an anticipated amount of contamination in the RB design, we can determine how
low the average contamination would need to be at a fraction allocated to treatment other than
1/2 in order to prefer that allocation. We do this by setting equation (7) equal to one and solving
for $\bar{c}_f$. This results in

$$\mathrm{MAC}_{WI} = 1 - \frac{1 - \bar{c}_{0.5}}{2\sqrt{f(1-f)}}. \tag{8}$$

Evaluations of equation (8) at representative values of f and $\bar{c}_{0.5}$ are given in Table 3.
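As a check on Table 3, the sketch below evaluates equation (8) over the same grid. It assumes the formula is MAC = 1 − (1 − c̄₀.₅) / {2√(f(1 − f))}, which is what results from setting equation (7) to one. The function name `mac_wi` and the layout of the printed grid are illustrative choices.

```python
import math


def mac_wi(f, cbar_half):
    """Largest average contamination at fraction f for which the WI design is
    still no worse than 1:1 allocation (assumed form of equation (8))."""
    return 1.0 - (1.0 - cbar_half) / (2.0 * math.sqrt(f * (1.0 - f)))


fractions = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
cbar_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

# Build the grid; negative values mean no achievable reduction is enough.
grid = {f: [mac_wi(f, c) for c in cbar_values] for f in fractions}

for f in fractions:
    cells = ["< 0" if v < -1e-9 else f"{abs(v):.3f}" for v in grid[f]]
    print(f"f = {f:<5}", "  ".join(cells))
```

A negative value means that even eliminating contamination entirely at that fraction would not compensate for the efficiency lost by moving away from 1:1 allocation.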

Finally, I present a formula for the variance of the treatment effect estimator when treatment
effects differ for different clusters. Under this model expected value calculations are unchanged,
but the variance of the estimated treatment difference under pseudo cluster
randomization becomes

$$\operatorname{var}_{het}(\bar{y}_A - \bar{y}_B) = \frac{2\sigma^2}{mn}\cdot\frac{f + w^2(1-f) + nq\{f - w(1-f)\}^2 + nq\omega\{f^2 + w^2(1-f)^2\}}{\{f + w(1-f)\}^2}. \tag{9}$$

I note that under the heterogeneous treatment effects model there no longer exists a weight w
that will result in a variance not depending on the unknown $\rho$ and $\omega$ parameters. In the interest
of space, no further calculations are presented with equation (9).
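A hedged Python sketch of equation (9) follows. It assumes the heterogeneous variance adds the term nqω{f² + w²(1 − f)²} to the numerator of the homogeneous formula, with σ² taken to be the within-cluster variance. The function name and the parameter values are illustrative only.

```python
def het_variance(f, w, n, m, rho, omega, sigma2=1.0):
    """Variance of the estimated treatment difference when treatment effects
    vary across clusters (assumed form of equation (9))."""
    q = rho / (1.0 - rho)
    denom = f + w * (1.0 - f)
    numer = (f + w ** 2 * (1.0 - f)
             + n * q * (f - w * (1.0 - f)) ** 2
             + n * q * omega * (f ** 2 + w ** 2 * (1.0 - f) ** 2))
    return 2.0 * sigma2 * numer / (m * n * denom ** 2)


# With omega = 0 the formula reduces to the homogeneous case.
var_homogeneous = het_variance(f=0.5, w=1.0, n=20, m=10, rho=0.2, omega=0.0)

# Randomized block design with heterogeneous effects.
var_rb_het = het_variance(f=0.5, w=1.0, n=20, m=10, rho=0.2, omega=0.5)

# The weight w = f / (1 - f) no longer removes the dependence on rho.
var_wi_low = het_variance(f=0.8, w=4.0, n=20, m=10, rho=0.05, omega=0.5)
var_wi_high = het_variance(f=0.8, w=4.0, n=20, m=10, rho=0.4, omega=0.5)
```

Comparing `var_wi_low` with `var_wi_high` illustrates the point made above: once ω is positive, the weight that removed ρ from equation (2) no longer does so.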



Conclusions: 

Description of conclusions, recommendations, and limitations based on findings. 

The current study shows how the results given in the paper by BMTP can be generalized to the
case where there are heterogeneous treatment effects. It further shows that when comparing
cluster randomized and randomized block designs in the presence of contamination, one must
pay careful attention to whether statistical power or mean squared error is the criterion for
evaluating the design. Finally, it is shown that in pseudo-cluster randomized
designs it is possible to create an estimator of the treatment effect whose variance does not
depend on the unknown value of the ICC. One such design is the usual randomized block
design. It is shown that one would choose to allocate an unequal number of individuals to
treatment and control within each cluster only if the average contamination is expected to
decrease substantially as a result. The further away from 1:1 allocation one goes, the more
contamination must be reduced in order to prefer the unequal allocation.
Table 1: MAC values. Homogeneous treatment effects and MSE as evaluation criterion.
Table 2: MAC values. Heterogeneous treatment effects and MSE as evaluation criterion.
Table 3: Values of MAC from equation (8). Rows give the fraction f allocated to the majority
treatment; columns give the average contamination at 1:1 allocation.

            0.1     0.2     0.3     0.4     0.5     0.6     0.7
f = 0.5     0.1     0.2     0.3     0.4     0.5     0.6     0.7
f = 0.6     0.081   0.184   0.286   0.388   0.49    0.592   0.694
f = 0.7     0.018   0.127   0.236   0.345   0.454   0.564   0.673
f = 0.8     < 0     0       0.125   0.25    0.375   0.5     0.625
f = 0.9     < 0     < 0     < 0     0       0.167   0.333   0.5
f = 0.95    < 0     < 0     < 0     < 0     < 0     0.082   0.312